A Survey on Text Classification Algorithms: From Text to Predictions

Abstract

In recent years, the exponential growth of digital documents has been met by rapid progress in text classification techniques. Newly proposed machine learning algorithms leverage the latest advancements in deep learning methods, allowing for the automatic extraction of expressive features. The swift development of these methods has led to a plethora of strategies to encode natural language into machine-interpretable data. The latest language modelling algorithms are used in conjunction with ad hoc preprocessing procedures, of which the description is often omitted in favour of a more detailed explanation of the classification step. This paper offers a concise review of recent text classification models, with emphasis on the flow of data, from raw text to output labels. We highlight the differences between earlier methods and more recent, deep learning-based methods in both their functioning and in how they transform input data. To give a better perspective on the text classification landscape, we provide an overview of datasets for the English language, as well as supplying instructions for the synthesis of two new multilabel datasets, which we found to be particularly scarce in this setting. Finally, we provide an outline of new experimental results and discuss the open research challenges posed by deep learning-based language models.

Similar resources

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

A Survey of Text Classification Algorithms

The problem of classification has been widely studied in the data mining, machine learning, database, and information retrieval communities with applications in a number of diverse domains, such as target marketing, medical diagnosis, news group filtering, and document organization. In this paper we will provide a survey of a wide variety of text classification

A Survey on Feature Selection Techniques and Classification Algorithms for Efficient Text Classification

The rapid growth of the World Wide Web has led to explosive growth of information. As most information is stored in the form of text, text mining has gained paramount importance. With the high availability of information from diverse sources, the automatic categorization of documents has become a vital method for managing and organizing vast amounts of information and for knowledge discovery. T...

Text Classification with Compression Algorithms

This work concerns a comparison of SVM kernel methods in text categorization tasks. In particular, I define a kernel function that estimates the similarity between two objects from their compressed lengths. In fact, compression algorithms can detect arbitrarily long dependencies within the text strings. Text vectorization loses information during feature extraction and is highly sensi...

A Survey of Text Clustering Algorithms

Clustering is a widely studied data mining problem in the text domains. The problem finds numerous applications in customer segmentation, classification, collaborative filtering, visualization, document organization, and indexing. In this chapter, we will provide a detailed survey of the problem of text clustering. We will study the key challenges of the clustering problem, as it applies to the...

Journal

Journal title: Information

Year: 2022

ISSN: 2078-2489

DOI: https://doi.org/10.3390/info13020083